Smart Paper Notes

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Category: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Translated by Google Translate

Song Emotion Recognition: a Performance Comparison Between Audio Features and Artificial Neural Networks

Karen Rosero , Arthur Nicholas dos Santos , Pedro Benevenuto Valadares , Bruno Sanches Masiero

Category: Artificial Intelligence

2022-09-24

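When a song is composed or performed, the singer or songwriter often intends to express feelings or emotions through it. For humans, matching the emotion in a musical composition or performance with the subjective perception of an audience can be quite challenging. Fortunately, the machine learning approach to this problem is simpler. It usually requires a dataset from which audio features are extracted and presented to a data-driven model, which is in turn trained to predict the probability that a given song matches a target emotion. In this paper, we study the features and models most commonly used in recent publications to tackle this problem, revealing which are best suited to recognizing emotion in a cappella songs.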

Intuitive Robot Programming by Capturing Human Manufacturing Skills: A Framework for the Process of Glass Adhesive Application

Mihail Babcinschi , Francisco Cruz , Nicole Duarte , Silvia Santos , Samuel Alves , Pedro Neto

Category: Robotics

2022-09-15

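There is great demand for the robotization of manufacturing processes that involve monotonous labor. Some manufacturing tasks requiring specific skills (welding, painting, etc.) suffer from a lack of workers. Robots have been used in these tasks, but their flexibility is limited because they are still difficult for non-experts to program and reprogram, which puts them out of reach for most companies. Robot offline programming (OLP) is reliable. However, paths generated directly from CAD/CAM do not include relevant parameters that represent human skills, such as the orientation and velocity of the robot end effector. This paper presents an intuitive robot programming system that captures human manufacturing skills and transforms them into robot programs. Demonstrations by skilled human workers are recorded using a magnetic tracking system attached to the work tools. The collected data include the orientation and velocity of the working paths. Positional data are extracted from CAD/CAM, because the error is significant when positions are captured by the magnetic tracker. Path poses are transformed in Cartesian space and validated in a simulation environment. Robot programs are then generated and transferred to the real robot. Experiments on the process of glass adhesive application demonstrate the intuitiveness and effectiveness of the proposed framework in capturing human skills and transferring them to the robot.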

Emergent social NPC interactions in the Social NPCs Skyrim mod and beyond

Manuel Guimarães , Pedro A. Santos , Arnav Jhala

Category: Artificial Intelligence

2022-07-27

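This work presents an implementation of a social architecture model for authoring non-player characters (NPCs) in open-world games, inspired by academic research on agent-based modeling. Authoring believable NPCs is burdensome in terms of rich dialogue and responsive behaviors. We briefly present the characteristics and advantages of using a social agent architecture for this task and describe an implementation of the social agent architecture CiF-CK as the mod Social NPCs.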

Towards Explainable Social Agent Authoring tools: A case study on FAtiMA-Toolkit

Manuel Guimarães , Joana Campos , Pedro A. Santos , João Dias , Rui Prada

Category: Artificial Intelligence

2022-06-07

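The deployment of Socially Intelligent Agents (SIAs) in learning environments has proven to have several advantages in different application areas. Social agent authoring tools allow scenario designers to create tailored experiences with a high degree of control over SIA behavior. On the flip side, this comes at a cost, because the complexity of the scenarios and of their authoring can become overbearing. In this paper, we introduce the concept of explainable social agent authoring tools, with the goal of analyzing whether authoring tools for social agents are understandable and interpretable. To this end, we examine whether one authoring tool, FAtiMA-Toolkit, is understandable, and whether its authoring steps are interpretable from the author's point of view. We conducted two user studies to quantitatively assess the interpretability, comprehensibility, and transparency of FAtiMA-Toolkit from the perspective of a scenario designer. One key finding is that the conceptual model of FAtiMA-Toolkit is, in general, understandable, but its emotion-based concepts were not as easily understood and used. Although there are some positive aspects regarding the explainability of FAtiMA-Toolkit, progress is still needed to achieve a fully explainable social agent authoring tool. We provide a set of key concepts and possible solutions that can guide developers in building such tools.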

Anxolotl, an Anxiety Companion App -- Stress Detection

Nuno Gomes , Matilde Pato , Pedro Santos , André Lourenço , Lourenço Rodrigues

Category: Machine Learning

2022-12-28

Stress has an effect on people's lives that cannot be overstated. While it can be good, since it helps humans adapt to new and different situations, it can also be harmful when not dealt with properly, leading to chronic stress. The objective of this paper is to develop a stress monitoring solution that can be used in real life and that tackles this challenge in a positive way. The SMILE data set was provided to team Anxolotl, and all that was needed was to develop a robust model. We developed a supervised learning model for classification in Python, achieving a final result of 64.1% accuracy and an F1-score of 54.96%. The resulting solution withstood the robustness test, showing low variation between runs, which was a major point in favor of its possible integration into the Anxolotl app in the future.

Perplexed by Quality: A Perplexity-based Method for Adult and Harmful Content Detection in Multilingual Heterogeneous Web Data

Tim Jansen , Yangling Tong , Victoria Zevallos , Pedro Ortiz Suarez

Category: Natural Language Processing

2022-12-20

As demand for large corpora increases with the size of current state-of-the-art language models, using web data as the main part of the pre-training corpus for these models has become a ubiquitous practice. This, in turn, has introduced an important challenge for NLP practitioners, who are now confronted with the task of developing highly optimized models and pipelines for pre-processing large quantities of textual data, which implies effectively classifying and filtering multilingual, heterogeneous, and noisy data at web scale. One of the main components of this pre-processing step for the pre-training corpora of large language models is the removal of adult and harmful content. In this paper, we explore different methods for detecting adult and harmful content in multilingual heterogeneous web data. We first show how traditional methods in harmful content detection, which seemingly perform quite well on small and specialized datasets, quickly break down when confronted with heterogeneous noisy web data. We then resort to a perplexity-based approach, but with a twist: instead of using a so-called "clean" corpus to train a small language model and then using perplexity to select the documents with low perplexity, i.e., the documents that most resemble this so-called "clean" corpus, we train solely on adult and harmful textual data and then select the documents whose perplexity is above a given threshold. This approach virtually clusters our documents into two distinct groups, which greatly facilitates the choice of the perplexity threshold and also allows us to obtain higher precision than with the traditional classification methods for detecting adult and harmful content.
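
The inverted perplexity filter described in this abstract can be illustrated with a toy sketch. This is not the authors' pipeline: the add-one-smoothed unigram model is a stand-in for whatever small language model they train, and the names `train_unigram`, `perplexity`, and `keep_clean`, the whitespace tokenization, and the threshold value are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_unigram(harmful_texts):
    """Count whitespace tokens in a corpus of harmful text (toy model)."""
    counts = Counter(tok for text in harmful_texts for tok in text.lower().split())
    return counts, sum(counts.values())


def perplexity(text, counts, total):
    """Perplexity of `text` under an add-one-smoothed unigram model."""
    tokens = text.lower().split()
    if not tokens:
        return float("inf")
    vocab_size = len(counts) + 1  # one extra bucket for unseen tokens
    log_prob = sum(
        math.log((counts[tok] + 1) / (total + vocab_size)) for tok in tokens
    )
    return math.exp(-log_prob / len(tokens))


def keep_clean(documents, counts, total, threshold):
    """Keep documents the harmful-text model finds surprising (high perplexity)."""
    return [doc for doc in documents if perplexity(doc, counts, total) > threshold]


# Hypothetical data: a tiny "harmful" training set and two web documents.
counts, total = train_unigram(["bad nasty vile words", "nasty vile bad content"])
docs = ["nasty vile bad", "the weather is sunny today"]
print(keep_clean(docs, counts, total, threshold=8.0))
```

Because the model only knows harmful vocabulary, text that resembles the training data scores low and everything else scores high, which is the separation into two groups that the abstract relies on when choosing a threshold.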

DDIPNet and DDIPNet+: Discriminant Deep Image Prior Networks for Remote Sensing Image Classification

Daniel F. S. Santos , Rafael G. Pires , Leandro A. Passos , João P. Papa

Category: Computer Vision | Machine Learning

2022-12-20

Research on remote sensing image classification significantly impacts essential human routine tasks such as urban planning and agriculture. Nowadays, the rapid advance in technology and the availability of many high-quality remote sensing images create a demand for reliable automation methods. The current paper proposes two novel deep learning-based architectures for image classification purposes, i.e., the Discriminant Deep Image Prior Network and the Discriminant Deep Image Prior Network+, which combine Deep Image Prior and Triplet Networks learning strategies. Experiments conducted over three well-known public remote sensing image datasets achieved state-of-the-art results, evidencing the effectiveness of using deep image priors for remote sensing image classification.

Smart Face Shield: A Sensor-Based Wearable Face Shield Utilizing Computer Vision Algorithms

Manuel Luis C. Delos Santos , Ronaldo S. Tinio , Darwin B. Diaz , Karlene Emily I. Tolosa

Category: Computer Vision

2022-12-18

The study aims to develop a wearable device to combat the onslaught of COVID-19. It also aims to enhance the regular face shields available on the market and to raise awareness of the health and safety protocols initiated by the government and its affiliates, supporting the enforcement of social distancing through the integration of computer vision algorithms. The wearable device is composed of various hardware and software components, including a transparent polycarbonate face shield, a microprocessor, sensors, a camera, a thin-film transistor on-screen display, jumper wires, a power bank, and the Python programming language. The algorithm incorporated in the study is object detection, a computer vision machine learning technique. The front camera, with OpenCV, determines the distance to a person in front of the user. Using TensorFlow, the system identifies and detects the target object in the image or live feed to obtain its bounding box. Determining the distance from the camera to the target object requires the focal length of the lens. To get the focal length, multiply the pixel width by the known distance and divide the result by the known width (Rosebrock, 2020). Unit testing ensures that the parameters are valid in terms of design and specifications.
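
The focal length rule quoted in this abstract can be written out as a short sketch. The first function follows the stated formula directly. The second rearranges the same relation to recover distance, which is an assumption about how the device uses it rather than something the abstract spells out. The function names and calibration numbers are hypothetical.

```python
def focal_length_px(pixel_width, known_distance, known_width):
    """Focal length in pixels: (pixel width * known distance) / known width."""
    return pixel_width * known_distance / known_width


def distance_to_object(known_width, focal_length, pixel_width):
    """Rearranged relation: distance = (known width * focal length) / pixel width."""
    return known_width * focal_length / pixel_width


# Hypothetical calibration: an 11-unit-wide object placed 24 units away
# spans 248 pixels in the calibration image.
focal = focal_length_px(pixel_width=248, known_distance=24, known_width=11)

# Later, the same object's bounding box is 124 pixels wide, so it is farther away.
print(distance_to_object(known_width=11, focal_length=focal, pixel_width=124))
```

Halving the apparent pixel width doubles the estimated distance, so the bounding-box width reported by the detector is the only per-frame measurement this estimate needs.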

Implementing Neural Network-Based Equalizers in a Coherent Optical Transmission System Using Field-Programmable Gate Arrays

Pedro J. Freire , Sasipim Srivallapanondh , Michael Anderson , Bernhard Spinnler , Thomas Bex , Tobias A. Eriksson , Antonio Napoli , Wolfgang Schairer , Nelson Costa , Michaela Blott

Category: Machine Learning

2022-12-09

In this work, we demonstrate the offline FPGA realization of both recurrent and feedforward neural network (NN)-based equalizers for nonlinearity compensation in coherent optical transmission systems. First, we present a realization pipeline showing the conversion of the models from Python libraries to the FPGA chip synthesis and implementation. Then, we review the main alternatives for the hardware implementation of nonlinear activation functions. The main results are divided into three parts: a performance comparison, an analysis of how activation functions are implemented, and a report on the complexity of the hardware. The performance in Q-factor is presented for the cases of a bidirectional long short-term memory coupled with convolutional NN (biLSTM + CNN) equalizer, a CNN equalizer, and standard 1-StpS digital back-propagation (DBP), for the simulated and experimental propagation of a single-channel dual-polarization (SC-DP) 16QAM signal at 34 GBd along 17x70 km of LEAF. The biLSTM + CNN equalizer provides a similar result to DBP and a 1.7 dB Q-factor gain compared with the chromatic dispersion compensation baseline in the experimental dataset. After that, we assess the Q-factor and the impact on hardware utilization when approximating the activation functions of the NN using Taylor series, piecewise linear, and look-up table (LUT) approximations. We also show how to mitigate the approximation errors with extra training and provide some insights into possible gradient problems in the LUT approximation. Finally, to evaluate the complexity of hardware implementation to achieve 400G throughput, fixed-point NN-based equalizers with approximated activation functions are developed and implemented in an FPGA.
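
The three activation-function approximations named in this abstract can be compared in a small floating-point sketch, here applied to tanh. This is not the authors' fixed-point FPGA implementation: the choice of tanh, the series order, the breakpoints, and the table size and range are all assumptions chosen for illustration.

```python
import math


def tanh_taylor(x):
    """Truncated Maclaurin series of tanh: x - x^3/3 + 2x^5/15."""
    return x - x**3 / 3 + 2 * x**5 / 15


# Assumed breakpoints; outside them the output is held at the endpoint value.
BREAKPOINTS = (-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0)


def tanh_piecewise_linear(x):
    """Linear interpolation between exact tanh values at the breakpoints."""
    if x <= BREAKPOINTS[0]:
        return math.tanh(BREAKPOINTS[0])
    if x >= BREAKPOINTS[-1]:
        return math.tanh(BREAKPOINTS[-1])
    for lo, hi in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            return (1 - t) * math.tanh(lo) + t * math.tanh(hi)


# Assumed table: 256 uniformly spaced entries over [-4, 4].
LUT_MIN, LUT_MAX, LUT_SIZE = -4.0, 4.0, 256
LUT_STEP = (LUT_MAX - LUT_MIN) / (LUT_SIZE - 1)
TANH_LUT = [math.tanh(LUT_MIN + i * LUT_STEP) for i in range(LUT_SIZE)]


def tanh_lut(x):
    """Nearest-entry look-up; inputs outside the table range are clamped."""
    clamped = min(max(x, LUT_MIN), LUT_MAX)
    return TANH_LUT[round((clamped - LUT_MIN) / LUT_STEP)]


for x in (0.25, 0.5, 1.5, 2.0):
    print(x, math.tanh(x), tanh_taylor(x), tanh_piecewise_linear(x), tanh_lut(x))
```

In this sketch the truncated series is accurate near zero but drifts badly for larger inputs, while the look-up table is piecewise constant, so its derivative is zero almost everywhere. That may be related to the gradient issues the abstract mentions, though the abstract does not say so explicitly.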
